# **Prototyping Formal System Models with Active Objects**

Eduard Kamburjan

Technische Universität Darmstadt, Germany kamburjan@cs.tu-darmstadt.de Reiner Hähnle Technische Universität Darmstadt, Germany haehnle@cs.tu-darmstadt.de

We propose active object languages as a development tool for formal system models of distributed systems. Additionally to a formalization based on a term rewriting system, we use established Software Engineering concepts, including software product lines and object orientation that come with extensive tool support. We illustrate our modeling approach by prototyping a weak memory model. The resulting executable model is modular and has clear interfaces between communicating participants through object-oriented modeling. Relaxations of the basic memory model are expressed as self-contained variants of a software product line. As a modeling language we use the formal active object language ABS which comes with an extensive tool set. This permits rapid formalization of core ideas, early validity checks in terms of formal invariant proofs, and debugging support by executing test runs. Hence, our approach supports the prototyping of formal system models with early feedback.

# **1** Introduction

Formal methods provide formal frameworks for software and systems development, including formally defined specification and programming languages. Their aim is to support design and implementation of engineering projects with high quality requirements, yet formal notations themselves are developed without the very support they are intended to provide. This is not simply an issue of productivity, but of usability: one of the largest obstacles against the uptake of formal methods is that they are expressed in—occasionally dated—formalisms that are hard to communicate, to understand, and to validate.

In this paper, we intend to show that recent progress in *software engineering*, including new structuring principles as well as state-of-art tool support, can be beneficial in *formal methods engineering* as well. Specifically, we use a modern active object language instead of a term-rewriting system to formalize semantics, we use software product lines (SPL) [23] to organize and maintain different variants of the formal model, and we use automated theorem proving tools to provide early validity checks of the consistency of the formalized system.

We exemplify our approach with a representative case study in the area of distributed systems, where we look at certain weak memory models. The latter became very widespread, but their consequences are notoriously hard to understand. Currently, weak memory models are generally formulated as a term rewriting system for a small step operational semantics (SOS) [5, 19], as abstract automata [25, 27], or as axiomatic/algebraic description of traces [6, 24]. The advantage of these formalisms is to give the modeler freedom to adjust the formal semantics so it matches the underlying intuition and to formulate properties of interest without any restriction. One disadvantage is that the resulting formal models are hard to understand for anyone who is not an expert in the used formalism. Even for experts such formal models tend to be hard to validate (i.e. debug), because limited tool support is available. Available tools (e.g., rewriting engines like Maude [7]) lack the usability and modularity embodied in more software-oriented tools. Finally, successful usage of currently available tools requires that the formalization of the target system is essentially finished, hence are not suitable for *early* prototyping *during* formalization.

In the following we demonstrate that active object modeling is adequate for the domain of weak memory models, and that it can offer significant tool support during development, analysis, and presentation of an operational semantics. While weak memory models can be regarded as representative and sufficiently complex, our approach is in no way limited to this particular domain.

In our approach, an active object model is developed simultaneously with an operational semantics. As a consequence, the modeler is able to profit from debugging techniques, tool support, and best practices of software engineering. This includes support for modularization of models, debugging by means of test runs, as well as automated proofs of invariants. We make use of four software engineering principles: (1) *modularization* with interfaces and modules, (2) *variant management* with software product lines, (3) *validation* by execution tests and formal verification, (4) development by *early prototyping* to obtain feedback and experience with the product before creating its final version. These concepts permit to develop formal models faster, in an interactive manner, and with higher confidence in their properties.

Our main contributions are: (1) a weak memory model, formalized in the Abstract Behavioral Specification (ABS) language [15]; (2) a discussion of the advantages of developing a prototype in an active object language in parallel to the actual formalization. The paper is organized as follows: Section 2 describes similar approaches and compares to other methods for mechanizing formal models. Section 3 introduces ABS. Section 4 describes the formal model and its implementation, Section 5 describes the model validation, and Section 6 discusses the advantages of our approach. Section 7 concludes.

# 2 Related Work and Discussion of Tool-Based Approaches

There is a recognized need for tool support to understand and analyze relaxed memory models in mainstream programming languages. This led to the implementation of simulators that are capable to explore the configuration states generated from a given semantics, see Boudol et al. [5] in Java and Sarkar et al. [25] in OCaml. In either case the simulator is far larger and more complex than the underlying semantics. The simulators are optimized for the generation of configuration states. Using general purpose languages as state generators has two main downsides: (1) It obliterates the differences between code expressing the model and code needed for a framework to execute it, especially if the program has to be optimized. Reasoning about the model in terms of the simulation program is hard, because it includes reasoning about the framework. (2) The simulators are not intended to be used to *communicate and reason* about the semantics, only to discover interesting configurations, the modularization and structure they may have is not transferable to the model.

Traditional tools used in work on formal semantics include term rewriting engines such as Maude [7], CafeOBJ [21], theorem provers like Isabelle/HOL [22], and model checkers like SPIN/Promela [14]. These approaches allow to state theorems about the model and often have support for executability and modularization through namespaces. However, they are mostly used as a more precise alternative to pen-and-paper definitions and suffer from the similar downsides: (1) they provide no additional modularity or interface abstractions—they do not *simplify* reasoning about a model, they merely *formalize* it. For example, Weber [31] provides an Isabelle formalization of the memory model of Mantel et al. [19] and finds notational errors that do not compromise the theoretical results, but hamper comprehensibility. These mistakes were discovered, because the tool enforces syntax checks, not because the formalization itself would provide a clearer structure. (2) traditional formalization tools are not designed for *prototyping* formal methods: i.e. to present and verify the core ideas of a semantic *before* fleshing it out. In order to validate formalization ideas by performing tool-based integrity checks or to ensure that the system behaves as intended, the full model needs to be formalized. Some provers like Coq [30] allow to assume

some properties without proving them for prototyping, but they already require the structure of the final model to be fixed. Finally, (3) current tools offer no specific support for the challenges of distributed systems and they do not help to manage different variants of a formal semantics. Libraries like K [26] provide support for operational semantics, but to not address distributed systems or modularity. This is a general drawback of general purpose tools: Prototyping relies on fast feedback cycles and general purpose approaches need libraries, which hamper analyzability or auxiliary code, which hampers clarity.

Active object languages like ABS [15] and Rebeca [29] try to combine the advantages of mainstream programming languages (executability, fine-grained modularization) with the advantages of languages with formal semantics (formal verification) and new ideas from programming language research (e.g., variability management through feature-based modeling [23]) to simplify working with concurrency. The enforced structure of actors and objects is too restrictive to use them to analyze and express *all* desirable global properties, but their tool support and clarity predestines them to prototype models of distributed systems: fast validation of core ideas instead of full verification of the complete models. Active object languages are used to formalize a vast range of distributed systems: software systems [18], cyber-physical systems [17], operational procedures [16], and hardware [10, 28]. We argue that distributed formal system models can as well be modeled with active objects and that this results in faster development and more efficient communication.

# **3** Abstract Behavioral Specification

We provide a very short introduction to the *Abstract Behavioral Specification* (ABS) language. For its formal semantics and a tutorial see Johnsen et al. [15], Hähnle [12], Muschevici et al. [20], Din et al. [9].

#### 3.1 Language

ABS extends the actor [13] concurrency model with futures [3] and cooperative scheduling: Objects communicate with each other only over *asynchronous* method calls. Following a method call, the caller receives a future as a handle for identifying the started process and continues execution. The callee *resolves* the future by storing its return value in it. An object may only switch the active process if the currently active process explicitly releases control. This greatly simplifies the concurrency model, as between the synchronization points a method can be regarded as sequential. A process releases control by either terminating or suspending. The latter means either to wait for a future to be resolved or for a condition to become true. Once a future is resolved, any process that possesses a reference to it may read it, because futures can be passed around. When a process attempts to read from a non-resolved future, the whole object blocks until the future is resolved.

ABS models can be compiled into Erlang, Haskell or Java code and be executed. Syntactically, ABS is close to Java and has similar concepts, including interfaces and classes. However, classes have no static fields and may not extend other classes. In addition, all objects are strongly encapsulated: the only way to access their state is via getter and setter methods they may declare. Modularization is supported by syntax modules, akin to Haskell, that allow to import and export classes, interfaces, and functions. The code in Figure 1 declares a module with a simple container model. We refrain from introducing the full ABS syntax, as most statements are standard. We only describe the statements specific to the concurrency model.

• To invoke (asynchronously) a method m on the object stored in o, the statement f = o!m(i) is used. The value of i is passed as the method parameter, and the resulting future is stored in f. The

```
1 module Container;
2 export *;
3 import Element from Element;
4 interface Container {
5
    Unit setElement(Element e);
    Element getElement();
6
7 }
8 class Container implements Container{
    Element contains = null;
9
    Unit setElement(Element e) { contains = e; }
10
   Element getElement() { return contains; }
11
12 }
```

Figure 1: A simple Container in ABS.

type of the future must match the return type of the method.

- There are two statements to synchronize with a future stored in f: (1) f.get attempts to synchronize on f by reading its value. If f is not resolved yet, then the process blocks until then: no other process may become active. (2) await f? releases control of the object until f is resolved, i.e., another process may become active.
- To wait on a condition, the statement **await** b releases control of the processor until the boolean expression evaluates to true. The behavior is scheduler dependent, as the expression may evaluate to false again, if another process with such a side-effect is scheduled first.

We abrreviate the pattern Fut<T> f = o!m(); await f?; i = f.get; with i = await o!m(); and write foreach (i < E) { ... } for Int i = 0; while (i < E) { ...; i = i + 1; }. If a class has a method with the signature Unit run(), then this method is started automatically upon object creation.

ABS is object-oriented, but does not enforce to model everything as an object – the enforced asynchronous communication leads to overhead for simple look-up operations. To omit this, ABS uses *Abstract Data Types* to abstract from data values which have no internal state.

### 3.2 Product Lines

ABS offers management of model variants via product lines. A product line describes different versions of a model that are obtained by certain syntactic operations on a common core. These syntactic operations are called *deltas* and they are able to add, as well as replace classes, methods, and fields. The deltas are applied before type checking. When a method is replaced in a delta, then the new version of the method may refer to the previous one with the keyword **original**. The Notify delta in upper part of Figure 2 modifies the setElement method, such that the element in the container is notified after being added and the delta Queue replaces the single contained element by a queue.

Product lines associate deltas with *features* and describe constraints among their application order. The Notify delta must be applied after Queue, because only that version of setElement calls original. This is achieved by the **after** directive in the lower part of Figure 2.

```
1 delta Notify;
    modifies class Container.Container {
2
       modifies Unit setElement(Element e) {
3
         original();
4
         await = e!observedBy(this);
5
       }}
6
7 delta Queue;
    modifies class Container.Container {
8
      removes Element contains;
9
10
       adds List<Element> contains = Nil;
       modifies Unit setElement(Element e) { contains = Cons(e, contains); }
11
       modifies Element getElement() {
12
         Element e = value(contains);
13
         contains = tail(contains);
14
         return e;
15
16
      }}
```

```
1 productline ContainerElement;
2 features QueueF, NotifyF;
3 delta Notify after Queue when NotifyF;
4 delta Queue when QueueF;
5 product NotifyProduct(NotifyF);
6 product FullProduct(QueueF, NotifyF);
```

Figure 2: Deltas and a product line in ABS.

### 3.3 Logic

ABS offers *invariant*-based reasoning to prove consistency and safety properties of single objects. Invariants are expressed as formulas in a first-order axiomatization of heap memory in a program logic and must hold at every release point, i.e. at the end of each method and whenever an **await** statement is reached. ABS methods are integrated into the logic by *modalities* over ABS statements. A calculus for showing validity of formulas in this logic is built into the theorem prover KeY-ABS [8]. It can verify invariants semi-automatically by *symbolic execution* of ABS statements. Symbolic execution can be used to compute formulas that correspond to symbolic state transformers reflecting the state changes caused by a given method.

The heap is axiomatized with functions to *select* and *store* values to/from a reserved variable *heap*. To express that a field i is a list containing only positive values one may use the following formula  $\varphi$ :

 $\varphi = \forall \text{ Int } k. \ k \ge 0 \land k < length(select(heap, self, i)) \rightarrow select(heap, self, i)[k] > 0$ 

Strong encapsulation is reflected in the signature: each formula may only reason about the fields of one class. The following expresses that  $\varphi$  is an invariant for a given piece of code: if it holds in the beginning, then it holds after executing i = Cons(10,i);

$$arphi 
ightarrow [ extsf{i}$$
 = Cons(10,i);] $arphi$ 

# 4 A Weak Memory Model

We present a simple weak memory model<sup>1</sup> that is able to simulate instruction reordering and write atomicity violation, i.e. the two main principles of weak memory models [1]. Our model is based on ideas taken from Boudol et al. [5], where identifiers—similar to the futures in ABS—are used and from Mantel et al. [19], where relaxation of the instruction order is characterized by pair-wise comparisons on the queue of memory accesses. To keep this paper within reasonable length, we give no examples for fences and visibility beyond read-own-write.

### 4.1 Memory

We start with the Memory interface that models system memory. Our core product variant is a conventional memory and does not allow instruction reordering. Its central concept is the Map<Location, Int> mem field that maps location to values. The client interface is:

```
interface Memory{
  Fut<Int> read(Thread t, Location loc);
  Unit write(Thread t, Location loc, Int val);
  Int const(Int i);
  }
```

The read/write methods model reading/writing a value from/to the memory and const returns a constant. Internally, memory access is managed by a list of waiting accesses and each call to read or write adds one access to that list. A single memory access has the following type:

Each memory Access has a unique id parameter. After adding an Access, the write and read methods terminate. The read method returns the future of a call to internalRead, which is resolved once the access has been executed. I.e. it waits until the id of the corresponding Access is added to the done set and then returns the read value, saved in the map ret.

```
1 Int read(Thread t, Location loc) {
    Int myId = counter; // global access counter
2
    list = appendright(list, Read(t, loc, myId));
3
    counter = counter + 1;
4
    return this!internalRead(myId);
5
6 }
7 Int internalRead(Int myId) {
8
    await contains(done, myId);
    return lookupUnsafe(ret, myId);//Note: value is not read from mem
9
10 }
```

The memory is modeled with a loop that first waits until the list of scheduled Access items is not empty, see Figure 3. Then it invokes a strategy which returns the positions of those accesses that can be safely executed next. Afterwards, it gets and removes the scheduled access from the list, and executes its effect. The value snd(pp) is obtained from a call to getValueFor and used in the read. In case of a write, the given value is written to memory. The strategy method implements the procedure defined in Mantel

<sup>&</sup>lt;sup>1</sup>The model is available at http://formbar.raillab.de/en/publications-and-tools/memory-model

et al. [19]: At each position *i* of list, the access at position *i* is compared to all accesses on positions j < i. The maySwap method takes two accesses and decides whether the second access can be executed before the first one. In the core product, i.e. under sequential consistency, it returns False if and only if the two accesses are from the same thread. The getValueFor method simply reads from memory.

```
1 Unit run() {
    while (True) {
2
       await list != Nil;
3
      List<Int> accList = this.strategy();
4
      Pair<Access, Int> pp = this.getAccess(cextPos(accList)); // invokes getValueFor
5
6
      case fst(pp) {
        Write(t, loc, val, id) =>
7
           { done = insertElement(done, id); mem = put(mem, loc, val); }
8
9
         Read(t, loc, id)
                                  =>
           { done = insertElement(done, id); ret = put(ret, id, snd(pp)); }
10
       }
11
    }
12
13 }
14 List<Int> strategy() {
    List<Int> allowed = Nil;
15
   foreach (i < length(list)) { // prettified</pre>
16
17
      Bool add = True;
      foreach (j < i) { // prettified</pre>
18
         Bool b = this.maySwap(nth(list,j), nth(list,i));
19
         if (!b) { add = False; }
20
       }
21
      if (add) { allowed = Cons(i, allowed); }
22
23
    }
24
    return allowed;
25 }
26 Int getValueFor(Thread tid, Location loc, Int pos) {
27
    return lookupUnsafe(mem, loc);
28 }
```

Figure 3: The next access loop and the access scheduling strategy.

#### 4.2 Threads

We aim to describe a memory model, not a whole programming language, hence threads are series of memory invocations. We do not model control flow or make assumptions about specific properties of the local store. A memory invocation can have one of the following forms:

- 1. await mem!write(this,Location(name),val); models a write access. Observe that after await execution continues once the Access is added to the list, not when the Access is executed.
- 3. reg.get; models an access where the value is retrieved, i.e. a computation.

The **await** statements enforce a FIFO treatement of *adding* memory accesses to the queue, from the point of view of a single process. The future returned by read is then used to synhronize on the *execution* of the added memory access. This is needed for two reasons: (1) Memory systems are synchronous by their nature and it is necessary to enforce synchronicity *at this point*, despite using a asynchronous language. (2) Additionally to asynchronous communication, ABS does not enforce FIFO scheduling, i.e., the callee object is not required to process the calls in the order they arrive.

### 4.3 Instruction Reordering

To include instruction reordering into our model, we define the predicates for program order relaxation described by Mantel et al. in terms of deltas (Section 3.2). Each relaxation predicate adds a condition in the maySwap method. For example, to permit write-read reordering, one applies the following delta:

```
1 delta WRDelta;
2 modifies class Mem.Memory {
    modifies Bool maySwap(Access a, Access b) {
3
      Bool ret = False;
4
5
      case a {
        Read(_,_,_)
                         => { ret = original(a,b); }
6
7
        Write(_,loca,_,_) => case b {
           Read(_,locb,_) => { last = original(a,b);
8
                                ret = (last || loca != locb); }
9
          Write(_,_,_) => { ret = original(a,b); }}
10
      }
11
12
      return ret;
13
    }}
```

This method checks whether the arguments are write and read accesses on different locations. The default is a call to **original**. The first access for each thread may be executed first, because the core product models the condition that accesses from different threads can always be reordered and **original** is called in each case. The remaining three reorderings are defined in an analogous manner.

### 4.4 Write-Atomicity Violation

To allow read-own-write, the getValueFor method must be changed: When executing a read at position pos, then it must check whether there is a write *from the same thread* in the access list before pos. This corresponds to local buffers for each thread. We refrain from modeling more fine-grained visibility for read-others-early for presentation's sake.

```
1 delta ReadEarlyDelta;
2 modifies class Mem.Memory {
3 modifies Int getValueFor(Thread tid, Location loc, Int pos) {
4 return case getWriteValueFor(slice(list, 0, pos-1), loc, tid) {
5 Just(val) => val;
6 Nothing => original(loc, pos);
7 };
8 }}
```

Method maySwap must now allow write-read reordering when both arguments access the same location. An additional delta WROwnDelta does this. It is identical to WRDelta, except that the statement in Lines 8-9 is if (loca==locb) ret = True; else ret = original(a,b);. Different memory models are modeled as product variants. Each memory model is obtained from a set of conditions for instruction reordering and the product corresponding to that model results from the application of appropriate deltas. The resulting product line is in Figure 4.

```
1 productline Memory;
2 features WWFeature, WRFeature, ReadEarlyFeature;
3 delta WWDelta when WWFeature;
4 delta WRDelta after WROwnDelta when WRFeature;
5 delta ReadEarlyDelta when ReadEarlyFeature;
6 delta WROwnDelta when ReadEarlyFeature;
7 product TSO (WRFeature, ReadEarlyFeature);
8 product PSO (WRFeature, WWFeature, ReadEarlyFeature);
9 product IBM370 (WRFeature);
```

Figure 4: Declaration of the weak memory product line, following the terminology of Mantel et al. [19].

# 5 Model Validation

The capability to *validate* design decisions made during formalization at an early stage can save a lot of work and helps to improve trust. We exemplify the validation with our model by three basic integrity checks: (1) the weakened model can produce traces that are only obtainable in a relaxed, not in a conventional, memory model; (2) the memory model cannot deadlock, provided that the threads using it do not deadlock, and (3) the modeling of ids is consistent.

### 5.1 Instruction Reordering

A classical litmus test, taken from Boudol et al. [5], is the following simplification of Dekker's algorithm:

```
v = 1; if w == 1 then ... run in parallel with w = 1; if v == 1 then ...
```

This program can result in a trace where v = w = 0 holds in the final state only under a weak memory model with write-read relaxations and relaxed write atomicity, e.g. IBM370 or TSO [1]. This behavior is reproducible in our model. The code in Figure 5 models the memory accesses of the program above. In the product TSO, the reordering can be observed by running the ABS model:

thread1: 0 thread2: 0

To enforce this behavior, one may, for example, let the memory system start after 3s and force that reads are executed before writes whenever possible.<sup>2</sup> In the core product, this behavior is not observable. This correlates with the fact that this behavior is not observable in architectures with sequential consistency.

### 5.2 Local Deadlock

A highly desirable property of any memory model is that it should not deadlock, provided that the whole program does not deadlock. The notion of deadlock in ABS is a *circular* dependency between multiple

<sup>&</sup>lt;sup>2</sup>Product TS0Demo, resp. IBMDemo in the downloadable model.

```
1 class T1(Memory mem) implements T {
                                             1 class T2(Memory mem) implements T {
2 Fut<Int> rg;
                                             2 Fut<Int> rg;
3 Unit run() {
                                            3 Unit run() {
  rg = mem.const(0);
4
                                             4
                                                rg = mem.const(0);
   await mem!write(this, Loc("v"), 1);
                                                await mem!write(this, Loc("w"), 1);
5
                                            5
   rg = await mem!read(this,Loc("w"));
                                                rg = await mem!read(this,Loc("v"));
6
                                            6
   println("tr1: "+toString(rg.get));
                                                println("tr2: "+toString(rg.get));
7
                                            7
                                             8 }
8
  }
9 }
                                            9}
```

Figure 5: Encoding of the example as a sequence of memory accesses.

processes [11], where a process  $p_1$  is said to *depend on* a process  $p_2$ , if  $p_1$  is halting at a f.get or await statement and execution of  $p_2$  would allow  $p_1$  to continue. We refrain from introducing the full semantics of deadlocks and execution, but give a proof sketch for the following lemma:

#### Lemma 1

There is no reachable state in any valid ABS program using the class Mem that contains a deadlock consisting only of processes running on an instance of Mem.

*Proof sketch.* First, we observe that there are no f.get statements and only two await statements, both with boolean guards: one in the run method, one in internalRead. Hence, every deadlock can only involve processes running run and internalRead. Second, after the await statement of internalRead, the method has no side effects, therefore, no other process can depend on a process running internalRead. So any deadlock can only involve processes running run. But there is no call of run, so there is only one process running run. There can be no deadlock, because for this at least two processes are needed.

A fully automatic deadlock-detection tool for ABS<sup>3</sup> is available, but it requires a complete system, not merely standalone classes. The code in Figure 5 is analyzed as deadlock-free in the core product, with one false positive *live*lock risk due to the non-terminating loop. Further deadlocks are not possible, because our thread model so far contains only memory accesses, but no synchronization statements.

#### 5.3 Invariants

We show an invariant about internal consistency of our model: there are no two memory accesses with the same id. This is a direct consequence of another invariant: the value of the counter field is always larger than the highest id occurring in the list. We formulate these two invariants in first-order logic:

$$\forall \mathbb{N} \ i, j. \ i \neq j \land i < \mathsf{length}(\mathsf{list}) \land j < \mathsf{length}(\mathsf{list}) \to \mathsf{id}(\mathsf{list}[i]) \neq \mathsf{id}(\mathsf{list}[j]) \tag{1}$$

$$\forall \mathbb{N} \ i. \ i < \mathsf{length}(\mathtt{list}) \to \mathsf{id}(\mathtt{list}[i]) < \mathsf{counter}$$
(2)

These invariants can be shown using the KeY-ABS prover *fully automatically*<sup>4</sup> for the read and write methods. The other methods trivially preserve these properties, as they do not add elements to list nor modify counter. This can be as well verified with KeY-ABS. The proofs for read and write have 640 and 851 steps, respectively, and take a few seconds.

<sup>&</sup>lt;sup>3</sup>http://formbar.raillab.de/deadlock

<sup>&</sup>lt;sup>4</sup>Invariants and KeY-ABS are included with the downloadable model. We had to formalize the theory of Access and List.

# 6 Discussion

Our model conceptually follows the one by Mantel et al. [19], so we mostly compare it to theirs. Their main goal is to establish non-interference results for weak memory models, while ours is to produce a modular, easy-to-follow, and analyzable formalization. Therefore, both approaches are not exclusive, but complementary: Rewriting systems can establish more complex properties, while active object languages help during formalization.

**Modularity.** We are able to provide a small and clear interface to program memory, see Section 4.1. This is difficult to achieve in term rewriting system-based models which do not syntactically distinguish rules for executing language and memory statements. In the model of [19], for example, memory access execution is spread among three different predicates in differing rule premises. Moreover, the parameters of predicates that model *different* aspects are not independent. For example, the predicate describing that an access at position *i* can be executed takes *i* as a parameter, while the predicate describing the read value takes the list [0, ..., i-1] as parameter.

The ABS model we created is in very close correspondence to the modeled concepts. Once the ABS model is in place, these *can be carried over* also to a term rewriting semantics: methods may be mapped to functions or predicates, interfaces to predicate signatures, fields to configuration elements, etc. The three different rule premises for executing an access mentioned above, for example, can be encapsulated into a single function that corresponds to one loop iteration of the run method. The ABS model helps to find such abstractions by enforcing a process-oriented view.

A small and clear interface reduces the risk of clerical errors in the formalization. It is justified by the fact that threads and system memory also have a clear interface in hardware. A concise interface also helps to transfer the memory model to another language. In both models [5, 19] the memory access aspects are polluted with specific constructs from the language in which the memory model resides: registers, special representations of identifiers, etc. An active object model could have helped to encapsulate the language-independent parts better. The ABS model makes no assumption about the implementation language, as it uses futures, not specific concepts that need to be modeled on top, yet it shows where and how modularity can be achieved.

**Product Variant Management.** Our model supports the generation of multiple variants. The modifications that produce these variants are *not visible* in the core product and hence does not pollute it. The standard approach in rewriting systems to support different variants is to parameterize it with underspecified functions or predicates that can be constrained by different axioms. In contrast to this, the product line approach expresses constraints by invariants and declaration of software product lines. This keeps the analysis simple for the core product, i.e. the basic case, as default values for parameters are unnecessary. It alleviates also analyzing the differences between different products, because the modifications are encapsulated in deltas and not scattered around.

**Debugging and Validation.** The deadlock check (Lemma 1) makes not use of a deadlock analysis tool, but of the *concept of a deadlock*. In ABS, the concept of deadlock is non-trivial (due to await on Boolean conditions), but natural, because ABS has notions of progress and unit of computation (process). These notions do not exist in term rewriting systems, making it hard to even state Lemma 1: rewrite rules for operational semantics often have multiple preconditions and it is not obvious which of them may block.

The execution witnesses from the litmus test on instruction reordering in Section 5.1 are an example of minimal requirements one may want to ensure. One can imagine to debug a semantics with the help of a whole test suite. While we did not provide a proof that the core product does not allow certain states, it would have been possible to formulate suitable invariant, similar as in Section 5.3, even though it might not be fully automatic. Active object languages like Rebeca [29] that offer model checking support would deliver a *fully automatic* proof.

In contrast to mainstream general purpose languages, ABS has a fully formal semantics, allows to state formal invariants, and comes with a verification tool that is able to verify such invariants. The invariants in Section 5.3 were shown automatically. Proofs with small amount of interaction might also be acceptable, but the amount of time to invest into formal validation depends on the modeled system.

In general, we conjecture that invariant proofs with model checking is adequate for consistency conditions, but too limited for complex properties such as non-interference, in particular, when *multiple* memory models are involved. Thus, validation can be used to ensure that properties hold during development, but it cannot replace a full formalization.

**Prototyping.** ABS is suitable for quick development: the first model presented above, including the first two validity checks were completed in ca. three modeling hours by one person familiar with ABS. The first model implementing the concepts of [19] including all three validity checks was ready after one more work day (by the same modeler). Further changes for the presentation, bug fixing, and the proof of the invariant took three more hours. Altogether, the whole development presented in this paper took less than two work days, but, of course, we had read and benefited from [5, 19].

The modeler made the experience that it was easier to explain the ABS model than the rewrite systembased model, even to researchers with experience in rewriting systems. Interestingly, one colleague found several bugs in a preliminary version of our model related to reordering, despite being only provided the same code segments that are presented in this paper, and not being familiar with weak memory models.

**Generalization and Limitations.** In this paper we chose the ABS language, mainly because of the similarities between futures and the identifiers of Boudol et al. [5], as well as to perform a validity check through a consistency invariant. Other active object languages would also be possible, for example, Rebeca [29] that comes with a model checker.

The choice of modeling language is influenced by which kind of validity checks are desired and by the desire to have a good match of concepts between the domain and the model. We suspect that active object languages are a less good choice for sequential systems. Furthermore, active object languages enforce strong encapsulation—if the modeled domain does not have strong encapsulation of its elements, then other approaches to prototype distributed systems might be a better fit.

# 7 Conclusion

We demonstrated that a model in an active object language results in a concise and easy-to-comprehend formal model of system memory. While such a formalization cannot replace a full-fledged formalization as a rewriting system or an abstract machine, it can help during modeling core features, as well to communicate and to structure the essential ideas. We showed that *message-based* concurrency is able to model *shared memory-based* concurrency to some extent. Based on this observation, we speculate that by modeling memory explicitly, the tools developed for actors and active objects are applicable for shared memory scenarios, including high performance computing.

**Future Work.** We intend to model in ABS the approach of Bijo et al. [4] for cache architectures and extend it with a write buffer to achieve a weak memory model. We aim to use the ABS tools for resource analysis [2] on different caching strategies to demonstrate that the analytic possibilities of active objects can go beyond what is possible in conventional non-trivial formal models. When doing this, we plan to develop the ABS model and the term rewriting extension of [4] in parallel, to test the prototyping approach advocated here.

The main theorem of Mantel et al. that non-interference under one memory model does not imply non-interference under another memory model is proven by providing a program and a distinguishing condition for each pair of memory models, such that the program fulfills the condition under one memory model, but not under the other. As the provided programs contain no loops, we assume that a model checking approach with Rebeca which would simplify the current proof is feasible.

### References

- Sarita V. Adve & Kourosh Gharachorloo (1996): Shared Memory Consistency Models: A Tutorial. IEEE Computer 29(12), pp. 66–76, doi:10.1109/2.546611.
- [2] Elvira Albert, Puri Arenas, Samir Genaim, Miguel Gómez-Zamalloa & Germán Puebla (2012): COSTABS: A Cost and Termination Analyzer for ABS. In: Proc. ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, ACM, pp. 151–154, doi:10.1145/2103746.2103774.
- [3] Henry C. Baker, Jr. & Carl Hewitt (1977): *The Incremental Garbage Collection of Processes*. SIGART Bull. (64), pp. 55–59, doi:10.1145/872736.806932.
- [4] Shiji Bijo, Einar Broch Johnsen, Ka I. Pun & S. Lizeth Tapia Tarifa (2016): An Operational Semantics of Cache Coherent Multicore Architectures. In: Proceedings of the 31st Annual ACM Symposium on Applied Computing, SAC '16, ACM, pp. 1219–1224, doi:10.1145/2851613.2851718.
- [5] Gérard Boudol, Gustavo Petri & Bernard P. Serpette (2012): Relaxed Operational Semantics of Concurrent Programming Languages. In Bas Luttik & Michel A. Reniers, editors: EXPRESS/SOS, Proc., EPTCS 89, pp. 19–33, doi:10.4204/EPTCS.89.3.
- [6] Sebastian Burckhardt, Madanlal Musuvathi & Vasu Singh (2010): Verifying Local Transformations on Relaxed Memory Models. In Rajiv Gupta, editor: Compiler Construction: 19th Intl. Conf. CC, Springer, pp. 104–123, doi:10.1007/978-3-642-11970-5\_7.
- [7] Manuel Clavel, Francisco Durán, Steven Eker, Patrick Lincoln, Narciso Martí-Oliet, José Meseguer & Carolyn Talcott (2007): All About Maude - a High-performance Logical Framework: How to Specify, Program and Verify Systems in Rewriting Logic. Springer-Verlag, doi:10.1007/978-3-540-71999-1.
- [8] Crystal Chang Din, Richard Bubel & Reiner Hähnle (2015): KeY-ABS: A Deductive Verification Tool for the Concurrent Modelling Language ABS. In Amy P. Felty & Aart Middeldorp, editors: Intl. Conference on Automated Deduction, LNCS 9195, Springer, pp. 517–526, doi:10.1007/978-3-319-21401-6\_35.
- [9] Crystal Chang Din & Olaf Owe (2015): *Compositional reasoning about active objects with shared futures*. Formal Aspects of Computing 27(3), pp. 551–572, doi:10.1007/s00165-014-0322-y.
- [10] Crystal Chang Din, Silvia Lizeth Tapia Tarifa, Reiner Hähnle & Einar Broch Johnsen (2015): *History-Based Specification and Verification of Scalable Concurrent and Distributed Systems*. In Michael J. Butler, Sylvain Conchon & Fatiha Zaïdi, editors: *Formal Methods and Software Engineering 17th International Conference on Formal Engineering Methods*, ICFEM 2015, Proceedings, Lecture Notes in Computer Science 9407, Springer, pp. 217–233, doi:10.1007/978-3-319-25423-4\_14.
- [11] Antonio Flores-Montoya, Elvira Albert & Samir Genaim (2013): May-Happen-in-Parallel Based Deadlock Analysis for Concurrent Objects. In: Formal Techniques for Distributed Systems, FMOODS/FORTE, pp. 273–288, doi:10.1007/978-3-642-38592-6\_19.

- [12] Reiner Hähnle (2012): The Abstract Behavioral Specification Language: A Tutorial Introduction. In Elena Giachino, Reiner Hähnle, Frank S. de Boer & Marcello M. Bonsangue, editors: Formal Methods for Components and Objects, 11th Intl. Symp., FMCO, Bertinoro, Italy, pp. 1–37, doi:10.1007/978-3-642-40615-7\_1.
- [13] Carl Hewitt, Peter Bishop & Richard Steiger (1973): A universal modular ACTOR formalism for artificial intelligence. In: Proceedings of the 3rd International Joint Conference on Artificial Intelligence, IJCAI'73, Morgan Kaufmann Publishers Inc., pp. 235–245. Available at http://dl.acm.org/citation.cfm?id= 1624775.1624804.
- [14] Gerard J. Holzmann (1991): Design and Validation of Computer Protocols. Prentice-Hall, Inc.
- [15] Einar Broch Johnsen, Reiner Hähnle, Jan Schäfer, Rudolf Schlatte & Martin Steffen (2010): ABS: A Core Language for Abstract Behavioral Specification. In Bernhard K. Aichernig, Frank S. de Boer & Marcello M. Bonsangue, editors: Formal Methods for Components and Objects, 9th Intl. Symp., FMCO, pp. 142–164, doi:10.1007/978-3-642-25271-6\_8.
- [16] Eduard Kamburjan & Reiner Hähnle (2017): Uniform Modeling of Railway Operations. In Cyrille Artho & Peter Csaba Ölveczky, editors: Formal Techniques for Safety-Critical Systems: 5th Intl. Workshop, FTSCS, Revised Selected Papers, Springer, pp. 55–71, doi:10.1007/978-3-319-53946-1\_4.
- [17] Ehsan Khamespanah, Kirill Mechitov, Marjan Sirjani & Gul A. Agha (2016): Schedulability Analysis of Distributed Real-Time Sensor Network Applications Using Actor-Based Model Checking. In Dragan Bosnacki & Anton Wijs, editors: Model Checking Software, 23rd Intl. Symp., SPIN, pp. 165–181, doi:10.1007/978-3-319-32582-8\_11.
- [18] Jia-Chun Lin, Ingrid Chieh Yu, Einar Broch Johnsen & Ming-Chang Lee (2016): ABS-YARN: A Formal Framework for Modeling Hadoop YARN Clusters. In Perdita Stevens & Andrzej Wasowski, editors: Fundamental Approaches to Software Engineering, 19th Intl. Conf., FASE, pp. 49–65, doi:10.1007/978-3-662-49665-7\_4.
- [19] Heiko Mantel, Matthias Perner & Jens Sauer (2014): Noninterference under Weak Memory Models. In: IEEE 27th Computer Security Foundations Symp., CSF, IEEE Computer Society, pp. 80–94, doi:10.1109/CSF.2014.14.
- [20] Radu Muschevici, Dave Clarke & José Proença (2013): Executable modelling of dynamic software product lines in the ABS language. In Andreas Classen & Norbert Siegmund, editors: 5th Intl. Workshop on Feature-Oriented Software Development, FOSD, pp. 17–24, doi:10.1145/2528265.2528266.
- [21] Shin Nakajima & Kokichi Futatsugi (1997): An Object-Oriented Modeling Method for Algebraic Specifications in CafeOBJ. In W. Richards Adrion, Alfonso Fuggetta, Richard N. Taylor & Anthony I. Wasserman, editors: Pulling Together, Proc. 19th Int. Conf. on Software Engineering, pp. 34–44, doi:10.1145/253228.253238.
- [22] Tobias Nipkow, Markus Wenzel & Lawrence C. Paulson (2002): Isabelle/HOL: A Proof Assistant for Higherorder Logic. Springer, doi:10.1007/3-540-45949-9.
- [23] Klaus Pohl, Günter Böckle & Frank J. van der Linden (2005): Software Product Line Engineering: Foundations, Principles and Techniques. Springer-Verlag New York, Inc., doi:10.1007/3-540-28901-1.
- [24] Vijay A. Saraswat, Radha Jagadeesan, Maged Michael & Christoph von Praun (2007): A Theory of Memory Models. In: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '07, ACM, pp. 161–172, doi:10.1145/1229428.1229469.
- [25] Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget & Derek Williams (2011): Understanding POWER multiprocessors. In Mary W. Hall & David A. Padua, editors: Proceedings of the 32nd ACM SIG-PLAN Conference on Programming Language Design and Implementation, PLDI 2011, ACM, pp. 175–186, doi:10.1145/1993498.1993520.
- [26] Traian Florin Şerbănuţă & Grigore Roşu (2010): K-Maude: A Rewriting Based Tool for Semantics of Programming Languages, pp. 104–122. Springer Berlin Heidelberg, doi:10.1007/978-3-642-16310-4\_8.

- [27] Peter Sewell, Susmit Sarkar, Scott Owens, Francesco Zappa Nardelli & Magnus O. Myreen (2010): X86-TSO: A Rigorous and Usable Programmer's Model for x86 Multiprocessors. Commun. ACM 53(7), pp. 89–97, doi:10.1145/1785414.1785443.
- [28] Zeinab Sharifi, Mahdi Mosaffa, Siamak Mohammadi & Marjan Sirjani (2013): Functional and Performance Analysis of Network-on-Chips Using Actor-based Modeling and Formal Verification. ECEASST 66. Available at http://journal.ub.tu-berlin.de/eceasst/article/view/890.
- [29] Marjan Sirjani, Ali Movaghar, Amin Shali & Frank S. de Boer (2004): Modeling and Verification of Reactive Systems using Rebeca. Fundam. Inform. 63(4), pp. 385-410. Available at http://content.iospress. com/articles/fundamenta-informaticae/fi63-4-05.
- [30] The Coq development team (2004): *The Coq proof assistant reference manual*. LogiCal Project. Available at http://coq.inria.fr. Version 8.0.
- [31] Alexandra Weber (2014): Comparison of an operational and an axiomatic model of execution for multithreaded programs. Master's thesis, TU Darmstadt. Available at https://hds.hebis.de/ulbda/Record/ HEB350072914.